AITopics | atypical speech

Affect Models Have Weak Generalizability to Atypical Speech

Narain, Jaya; Romana, Amrit; Mitra, Vikramjit; Lea, Colin; Ren, Shirley

arXiv.org Artificial Intelligence | Aug-1-2025

Speech and voice conditions can alter the acoustic properties of speech, which could impact the performance of paralinguistic models for affect for people with atypical speech. We evaluate publicly available models for recognizing categorical and dimensional affect from speech on a dataset of atypical speech, comparing results to datasets of typical speech. We investigate three dimensions of speech atypicality: intelligibility, which is related to pronunciation; monopitch, which is related to prosody; and harshness, which is related to voice quality. We look at (1) distributional trends of categorical affect predictions within the dataset, (2) distributional comparisons of categorical affect predictions to similar datasets of typical speech, and (3) correlation strengths between text and speech predictions for spontaneous speech for valence and arousal. We find that the output of affect models is significantly impacted by the presence and degree of speech atypicalities. For instance, the percentage of speech predicted as sad is significantly higher for all types and grades of atypical speech when compared to similar typical speech datasets. In a preliminary investigation on improving robustness for atypical speech, we find that fine-tuning models on pseudo-labeled atypical speech data improves performance on atypical speech without impacting performance on typical speech. Our results emphasize the need for broader training and evaluation datasets for speech emotion models, and for modeling approaches that are robust to voice and speech differences.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2504.16283

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.68)
Health & Medicine > Consumer Health (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Speech (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Voice Quality Dimensions as Interpretable Primitives for Speaking Style for Atypical Speech and Affect

Narain, Jaya; Kowtha, Vasudha; Lea, Colin; Tooley, Lauren; Yee, Dianna; Mitra, Vikramjit; Huang, Zifang; Marques, Miquel Espi; Huang, Jon; Avendano, Carlos; Ren, Shirley

arXiv.org Artificial Intelligence | May-29-2025

Perceptual voice quality dimensions describe key characteristics of atypical speech and other speech modulations. Here we develop and evaluate voice quality models for seven voice and speech dimensions (intelligibility, imprecise consonants, harsh voice, naturalness, monoloudness, monopitch, and breathiness). Probes were trained on the public Speech Accessibility Project (SAP) dataset with 11,184 samples from 434 speakers, using embeddings from frozen pre-trained models as features. We found that our probes had both strong performance and strong generalization across speech elicitation categories in the SAP dataset. We further validated zero-shot performance on additional datasets, encompassing unseen languages and tasks: Italian atypical speech, English atypical speech, and affective speech. The strong zero-shot performance and the interpretability of results across an array of evaluations suggest the utility of using voice quality dimensions in speaking style-related tasks.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.21809

Country: North America > United States > California (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.95)
Information Technology > Artificial Intelligence > Natural Language (0.71)

Hypernetworks for Personalizing ASR to Atypical Speech

Müller-Eberstein, Max; Yee, Dianna; Yang, Karren; Mantena, Gautam Varma; Lea, Colin

arXiv.org Artificial Intelligence | Jul-2-2024

Parameter-efficient fine-tuning (PEFT) for personalizing automatic speech recognition (ASR) has recently shown promise for adapting general population models to atypical speech. However, these approaches assume a priori knowledge of the atypical speech disorder being adapted for, the diagnosis of which requires expert knowledge that is not always available. Even given this knowledge, data scarcity and high inter/intra-speaker variability further limit the effectiveness of traditional fine-tuning. To circumvent these challenges, we first identify the minimal set of model parameters required for ASR adaptation. Our analysis of each individual parameter's effect on adaptation performance allows us to reduce Word Error Rate (WER) by half while adapting 0.03% of all weights. Alleviating the need for cohort-specific models, we next propose the novel use of a meta-learned hypernetwork to generate highly individualized, utterance-level adaptations on the fly for a diverse set of atypical speech characteristics. Evaluating adaptation at the global, cohort, and individual levels, we show that hypernetworks generalize better to out-of-distribution speakers, while maintaining an overall relative WER reduction of 75.2% using 0.1% of the full parameter budget.
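
For readers unfamiliar with the metrics quoted in the abstract above, the following is a minimal sketch of how Word Error Rate (WER) and a relative WER reduction are commonly computed. It is not the authors' evaluation code: the function names `word_error_rate` and `relative_wer_reduction`, the whitespace tokenization, and the example sentences and WER values are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: tokens are obtained by splitting on whitespace; real
    evaluations usually normalize text (case, punctuation) first.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Fraction of the baseline WER removed by adaptation."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer


# Hypothetical example values, not taken from the paper.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
reduction = relative_wer_reduction(baseline_wer=0.40, adapted_wer=0.10)
```

Under this definition, a relative reduction is measured against the baseline model's WER, so the same percentage can correspond to very different absolute error rates depending on how poorly the unadapted model performs.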

adaptation, fine-tuning, hypernetwork, (16 more...)

arXiv.org Artificial Intelligence

2406.0424

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Google made an app to ease communication for people with speech impairments

Engadget | Nov-9-2021, 20:30:16 GMT

For too long, people with speech impairments have struggled to be understood not only by other people, but also by voice-based technology. Though some companies have started to make their products work better for people with atypical speech, the most prevalent services still don't hear them well. Google announced today that it's made a new Android app called Project Relate that could help people with speech impairments communicate more easily with others and the Assistant. It's looking for beta testers to test and improve the app starting today. As Julie Cattiau, a product manager at Google Research, said in a video, "standard speech recognition doesn't always work as well for people with atypical speech because the algorithms have not been trained on samples of their speech."

atypical speech, ease communication, speech impairment, (2 more...)

Engadget

Genre: Press Release (0.61)

Technology:

Information Technology > Communications > Mobile (0.64)
Information Technology > Artificial Intelligence > Speech (0.45)

AI Technologies that are Reshaping Social Infrastructure

#artificialintelligence | Jan-8-2020, 22:01:01 GMT

Together with the rise of the Internet, access to large repositories of data has helped machine learning technology grow exponentially. The incredibly quick pace of growth was unprecedented. As a result, it is obvious that AI will make a significant impact on the world in the years to come. However, with the numerous established and emerging fields of AI around today, such a blanket statement doesn't provide much concrete meaning. What fields and applications of AI are receiving the most investment and development?

application, recognition, reshaping social infrastructure, (10 more...)

#artificialintelligence

AI-Alerts: 2020 > 2020-01 > AAAI AI-Alert for Jan 14, 2020 (1.00)

Country:

North America (0.05)
Asia > China (0.05)
Africa > Nigeria (0.05)

Industry:

Transportation > Ground > Road (0.52)
Health & Medicine > Diagnostic Medicine (0.51)
Information Technology > Security & Privacy (0.49)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Vision (0.78)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.50)
(4 more...)
